In humans, copy number variations (CNVs) are a common
source of phenotypic diversity and disease susceptibility.
Facioscapulohumeral muscular dystrophy (FSHD) is
an important genetic disease caused by CNVs. It is an 
autosomal-dominant myopathy caused by a reduction in 
the copy number of the D4Z4 macrosatellite repeat lo- 
cated at chromosome 4q35. Interestingly, the reduction of 
D4Z4 copy number is not sufficient by itself to cause FSHD. 
A number of epigenetic events appear to affect the sever- 
ity of the disease, its rate of progression, and the distribu- 
tion of muscle weakness. Indeed, recent findings suggest 
that virtually all levels of epigenetic regulation, from DNA 
methylation to higher order chromosomal architecture, 
are altered at the disease locus, causing the de-regulation 
of 4q35 gene expression and ultimately FSHD. 



Copy number variations are an important 
source of human genetic diversity 

Genetic association studies generally evaluate single-nucleotide 
polymorphisms (SNPs), which are single nucleotides at specific 
genomic locations that vary between individuals of the same 
species. Recent results indicate that the human genome contains 
another frequent type of polymorphism: copy number varia- 
tions (CNVs; Conrad et al., 2010). A CNV is a segment of DNA 
that can be found in various copy numbers in the genomes of 
different individuals (Fig. 1). CNVs range in size from a few 
hundred nucleotides to several megabases. Compared with SNPs, 
CNVs affect a larger fraction of the genome and
arise more frequently. Hence, CNVs significantly contribute to 
human evolution, genetic diversity, and an increasing number of 
phenotypic traits (Stankiewicz and Lupski, 2010).

Depending on the genomic context, CNVs can have varying
effects (Fig. 1). For example, recent data indicate that CNVs
directly alter the structure of 12.5% of protein-coding genes 
(Conrad et al., 2010), and there is increasing evidence to suggest
that CNVs play an important role in a number of Mendelian dis- 
eases and common complex disorders (Stankiewicz and Lupski, 
2010). For example, Charcot-Marie-Tooth type 1A disease is 
caused by duplications in the PMP22 gene, which encodes an inte- 
gral membrane protein that is a major component of compact my- 
elin in the peripheral nervous system (Chance et al., 1994). Also, 
susceptibility to acquired immune deficiency syndrome is affected
by segmental duplications encompassing the CCL3L1 gene,
which encodes a chemokine that is a ligand for CCR5, the human
immunodeficiency virus coreceptor (Gonzalez et al., 2005).

CNVs can also affect gene dosage and expression (Stranger 
et al., 2007). Recent data indicate that non-B-DNA forming se- 
quences, which are usually enriched in promoter regions, are 
also enriched in CNV breakpoints. Thus, the same features that 
are involved in transcriptional regulation may also be involved 
in the formation of CNVs. As a consequence, CNVs might shape 
the evolution of gene regulation (Conrad et al., 2010). 

More than half of the human genome is comprised of repeti- 
tive sequences (Neguembor and Gabellini, 2010). Because repeti- 
tive sequences can act as substrates for homologous recombination, 
their presence facilitates the instability of our genome (Gu et al., 
2008). As a result, repetitive sequences account for a significant 
amount of human CNVs (Warburton et al., 2008). 

In this review we will focus on an important human ge- 
netic disease, facioscapulohumeral muscular dystrophy (FSHD), 
which is caused by the presence of CNVs of a repetitive sequence 
that regulates gene expression. 

Clinical features of FSHD 

FSHD (MIM #158900) is characterized by the progressive 
weakness and atrophy of a specific subset of skeletal muscles.

Figure 1. CNVs and their possible effects on gene expression. CNVs can generate different outcomes based on the nature of the affected element. If a
CNV encompasses an entire gene(s) locus, the dosage of the gene may be altered, leading to gene amplification or deletion. Alternatively, the affected
region can encompass only part of a gene. In this case, if the CNV includes an exon, it can result in the production of an aberrant protein isoform. When
a CNV localizes to a nonprotein coding region of the gene it may generate imbalances at the level of splicing or noncoding RNA (ncRNA) production.
In addition, extragenic CNVs can give rise to altered gene expression when affecting cis regulatory regions.



As the name implies, FSHD mostly affects the muscles of the 
face, scapula, and upper arms (Tawil et al., 1998). The peculiar 
involvement of specific muscles is such a striking feature of 
FSHD that it is often used in the clinic to distinguish FSHD 
from the other forms of muscular dystrophy (Padberg et al., 
1991). FSHD onset generally involves the wasting of facial 
muscles, such as orbicularis oris and orbicularis oculi, whereas 
others, like the pharyngeal and lingual muscles, are unaffected. 
As the disease progresses, limb girdle muscles, such as scapula 
fixator and trapezius, are also affected. Abdominal muscle weak- 
ness is another feature of FSHD, causing a characteristic lordotic 
posture (an abnormal curvature of the lumbar spine) associated 
with a protuberant abdomen. In the most severe cases, the 
muscular degeneration can extend to the pelvic girdle and foot 
dorsiflexor muscles, thereby affecting the ability of the patient 
to walk. Approximately 20% of FSHD patients become wheel- 
chair bound (Pandya et al., 2008). 

FSHD is associated with retinal vasculopathy, a blood vessel 
disorder of the retina, in 60% of cases (Tawil and Van Der Maarel, 
2006) and sensorineural hearing loss in 75% of affected indi- 
viduals (Trevisan et al., 2008). Mental retardation, epilepsy, and 
cardiac involvement are also present in FSHD patients more fre- 
quently than in healthy people (Faustmann et al., 1996; Funakoshi 
et al., 1998; Trevisan et al., 2006, 2008; Saito et al., 2007). 

Most FSHD patients report their first symptoms during 
the second or third decade of their life; however, the age of
onset can vary from infancy to age 50 (van der Maarel et al.,
2007). Early-onset cases are generally associated with more se- 
vere phenotypes (Miura et al., 1998; Klinge et al., 2006). Inter- 
estingly, the FSHD phenotype is gender dependent. Typically, 
males are more severely affected, whereas females can develop 
a milder or asymptomatic form of the disease (Padberg, 1982; 
Zatz et al., 1998; Ricci et al., 1999; Tonini et al., 2004).

Muscle impairment in FSHD is often asymmetric: mus- 
cles on one side of the body appear much more compromised 
than on the other side (Kilmer et al., 1995). Various hypotheses 
have been proposed to explain this phenomenon, including over- 
work weakness and handedness, but the mechanism underlying 
this asymmetric phenotype in FSHD remains unknown (Pandya 
et al., 2008). 

The rate of FSHD progression and the distribution of 
muscle weakness are highly variable, even between close family 
relatives. Indeed, these features were noted in the first study 
conducted on the disease in the late 1800s (Landouzy and
Dejerine, 1886). A number of monozygotic twins discordant for 
the penetrance of FSHD have been described, pointing to a 
strong epigenetic component in the disease (Tawil et al., 1993; 
Griggs et al., 1995; Hsu et al., 1997; Tupler et al., 1998).

Although the genetic defect underlying FSHD has been 
identified, the molecular mechanism causing the disease re- 
mains unclear. Recent results suggest that complex genetic 
events contribute to FSHD, as discussed below.

Figure 2. FSHD is caused by a reduction in the copy number of D4Z4 repeats. (A) Healthy individuals carry 11-150 units of D4Z4. (B) FSHD patients
have fewer than 11 repeats. (C) At least one copy of D4Z4 is required for FSHD development, as individuals completely lacking D4Z4 are healthy.
(D) Patients having a large deletion encompassing FRG2 and DUX4c have been described, indicating that these genes are not necessary for FSHD. Dotted
lines indicate that distances are not to scale.



FSHD genetics 

FSHD is the third most common myopathy, with an incidence 
in the general population of 1:15,000 (Flanigan et al., 2001). 
The disease is transmitted as an autosomal-dominant trait,
although it presents very complex genetics. Up to 30% of cases 
are due to de novo mutations (Zatz et al., 1995). Approximately 
half of the de novo cases result from a post-zygotic mutation 
that leads to mosaicism (Griggs et al., 1993; Weiffenbach et al., 
1993; Upadhyaya et al., 1995; Bakker et al., 1996; van der 
Maarel et al., 2000). In more than 95% of cases, the disease 
maps to the subtelomeric region of chromosome 4 long arm at 
4q35.2 (FSHD1; Wijmenga et al., 1990, 1991). A small percentage
of FSHD cases are not genetically linked to chromosome 4
(FSHD2; Wijmenga et al., 1991; Gilbert et al., 1992; Bakker 
et al., 1995), although no other putative genetic locus has 
been identified. 

In 4q-associated FSHD cases, the disease is caused by a 
molecular rearrangement that causes copy number variations of 
a 3.3-kb tandem repeated macrosatellite called D4Z4 (Wijmenga 
et al., 1992; van Deutekom et al., 1993). D4Z4 is extremely
polymorphic in the general population (Hewitt et al., 1994; 
Winokur et al., 1994), ranging from 11 to 150 copies. FSHD 
patients carry only 1 to 10 units (Fig. 2; Wijmenga et al., 1992; 
van Deutekom et al., 1993). 

Although FSHD is highly variable, there is a general cor- 
relation between the number of residual D4Z4 repeats, the age 
of onset, and the severity of the disease (Goto et al., 1995; Lunt 
et al., 1995; Zatz et al., 1995; Tawil et al., 1996; Hsu et al., 
1997; Ricci et al., 1999). In particular, larger deletions tend to 
be associated with earlier onset and a more rapid progression 
of FSHD (Goto et al., 1995; Lunt et al., 1995; Zatz et al., 1995; 
Tawil et al., 1996; Hsu et al., 1997; Ricci et al., 1999). Impor- 
tantly, it has been reported that at least one copy of D4Z4 is re- 
quired to cause FSHD, as individuals with deletions of the entire 
repeat array do not display signs of muscular dystrophy (Fig. 2; 
Goto et al., 1995; Tupler et al., 1996; Rossi et al., 2007), sug- 
gesting that the repeat itself plays a critical role in the disease. 
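
The copy-number ranges quoted above can be written down as a small lookup. The following is a minimal sketch for illustration only, not a diagnostic tool: the function names and category labels are hypothetical, the thresholds (0, 1 to 10, 11 to 150 units) and the 3.3-kb unit size are the ones stated in the text, and the length estimate assumes an array made of identical contiguous units.

```python
D4Z4_UNIT_KB = 3.3  # size of one D4Z4 unit, as stated in the text


def classify_d4z4_units(units: int) -> str:
    """Map a 4q35 D4Z4 unit count to the ranges quoted in the text.

    The labels are illustrative (hypothetical) names, not clinical categories.
    """
    if units < 0:
        raise ValueError("unit count cannot be negative")
    if units == 0:
        # Complete loss of the array was reported without muscular dystrophy.
        return "no D4Z4 (not associated with FSHD)"
    if units <= 10:
        return "contracted (FSHD range)"
    if units <= 150:
        return "general-population range"
    return "above reported range"


def array_length_kb(units: int) -> float:
    """Approximate array length in kb, assuming identical 3.3-kb units."""
    if units < 0:
        raise ValueError("unit count cannot be negative")
    return round(units * D4Z4_UNIT_KB, 1)


if __name__ == "__main__":
    for n in (0, 4, 10, 11, 150):
        print(n, classify_d4z4_units(n), array_length_kb(n), "kb")
```

Because the correlation between residual repeat number and severity is only a general trend, the sketch deliberately returns a range label rather than any prediction of onset or severity.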



A detailed genomic characterization of the 4q35 region led 
to the identification of different haplotypes (Fig. 3; van Geel 
et al., 2002; Lemmers et al., 2007, 2010a). A simple sequence 
length polymorphism is localized 3.5 kb proximal to D4Z4. 
D4F104S1 (p13E-11), a region located immediately proximal to
D4Z4, contains 15 SNPs. The most proximal unit of the D4Z4 
repeat array contains several SNPs. Finally, a large region of 
sequence variation (alleles A, B, or C) has been detected distal 
to D4Z4 (Fig. 3). Considering these various features, 4q alleles 
were subdivided into 18 haplotype variants (Lemmers et al., 2010a).
Importantly, D4Z4 deletions are pathogenic only in a few of 
these haplotype backgrounds (4qA161, 4qA159, and 4qA168; 
Lemmers et al., 2007, 2010b). However, D4Z4 deletions on these
haplotypes are not sufficient to cause FSHD, because asymptomatic
4qA161 carriers have been described (Arashiro
et al., 2009). This suggests that these haplotypes represent only a
permissive condition for FSHD rather than the causative
event. Notably, FSHD2 patients were observed to carry at
least one 4qA161 allele (de Greef et al., 2009), further supporting
the role of this permissive haplotype in the disease.
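
The "necessary but not sufficient" logic of this paragraph can be made explicit in a few lines. This is a hedged sketch under stated assumptions: the function name and return labels are hypothetical, the permissive haplotypes and the 1 to 10 unit contraction range are taken from the text, and the result describes only the genetic context, since asymptomatic carriers of contracted permissive alleles exist.

```python
# Haplotypes on which D4Z4 contractions were reported as pathogenic (from the text).
PERMISSIVE_HAPLOTYPES = frozenset({"4qA161", "4qA159", "4qA168"})


def fshd1_genetic_context(units: int, haplotype: str) -> str:
    """Describe the 4q35 genetic context for one allele (illustrative labels only)."""
    if units < 0:
        raise ValueError("unit count cannot be negative")
    contracted = 1 <= units <= 10           # FSHD-range contraction, per the text
    permissive = haplotype in PERMISSIVE_HAPLOTYPES
    if contracted and permissive:
        # Permissive context only: the text stresses this is not sufficient by itself.
        return "permissive: contraction on a permissive haplotype"
    if contracted:
        return "contracted, non-permissive haplotype"
    return "no FSHD-range contraction"


if __name__ == "__main__":
    print(fshd1_genetic_context(5, "4qA161"))
    print(fshd1_genetic_context(5, "4qA166"))   # 4qA166 is described as non-permissive
    print(fshd1_genetic_context(30, "4qA161"))
```

Keeping the haplotype set as data rather than hard-coding it in the conditions makes it easy to adjust if the list of permissive backgrounds is revised.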

There are sequences homologous to D4Z4 on several 
human chromosomes (Lyle et al., 1995). On chromosome 10q26 
there is a repeat array that shares 98% identity with the D4Z4 
repeat array at 4q35 (Cacurri et al., 1998). Additionally, high 
homology extends to 45 kb proximal of D4Z4 and 15-25 kb 
distal (van Geel et al., 2002). The 10q and 4q D4Z4 repeats are
equally polymorphic, and some individuals have nonstandard,
hybrid alleles containing 4q-derived repeats on chromosome 10
and 10q repeats on chromosome 4 (van Deutekom et al., 1996a;
van Overveld et al., 2000; Lemmers et al., 2010a). Several studies
have reported that the D4Z4 repeats (even if 4q derived) on
10q are not pathogenic, suggesting that 4q-specific sequences
proximal to D4Z4 are required for FSHD (Bakker et al., 1995; 
Deidda et al., 1996; van Deutekom et al., 1996a; Lemmers 
et al., 1998; van Overveld et al., 2000). However, an exception
to this rule was recently described (Lemmers et al., 2010b).

Figure 3. SNPs and sequence variants at 4q35. 18 different 4q haplotypes have been described. FSHD patients carry D4Z4 deletions in 4qA161,
4qA159, and 4qA168 backgrounds. These genetic contexts represent a permissive condition for the disease rather than a cause given that asymptomatic
carriers have been described. SSLP, simple sequence length polymorphism; TEL, telomere.

In this unusual case, only the distal end of the D4Z4 repeat array
was transferred to chromosome 10. Thus, the 4q35 FSHD candidate
genes located proximal to the D4Z4 repeat array were
not present on chromosome 10. This finding suggested that 
proximal 4q genes are not required for the pathogenesis of 
FSHD (Lemmers et al., 2010b). It should be noted, however, that
this patient is very unusual, carrying a rare haplotype with
deleted hybrid repeats on chromosome 10q and a permissive
4qA161 allele on chromosome 4q (Lemmers et al., 2010b).
Hence, in this case the disease could still be linked to chromosome
4 through an in trans effect of the hybrid 10q repeats on
the permissive chromosome 4q.

Although the exact molecular mechanism responsible for 
the disease is unknown, it is agreed in the field that the D4Z4 
deletion causes an epigenetic gain-of-function alteration lead- 
ing to the up-regulation of candidate gene(s) (Neguembor and 
Gabellini, 2010). 

Epigenetic features associated with FSHD

Several of the clinical features outlined above, such as the gen- 
der bias in severity, the asymmetric muscle wasting, and the dis- 
cordance in monozygotic twins, suggest that FSHD development 
involves epigenetic factors (Neguembor and Gabellini, 2010). 

Epigenetic changes do not affect the primary DNA se- 
quence; rather, gene expression is altered by changing the con- 
formation of chromatin. Local chromatin structure is regulated 
by at least three processes: DNA methylation (Suzuki and Bird, 
2008), histone modifications (Ruthenburg et al., 2007), and 
ATP-dependent chromatin remodeling (Ho and Crabtree, 2010). 
In addition, a number of elements (chromatin boundaries, insulators,
etc.) affect higher order chromatin structure by regulating
long distance interactions and chromatin loop domain organization
(Maeda and Karch, 2007). Here, we summarize the studies
that have investigated the epigenetic factors involved in FSHD. 

DNA methylation in FSHD. D4Z4 belongs to a
family of human tandem repeats termed macrosatellites that are 
noncentromerically located (Chadwick, 2009). Together with 
other members of the family, such as DXZ4 on chromosome X 
(Giacalone et al., 1992) and RS447 on 4p (Kogi et al., 1997), 
D4Z4 is extremely GC rich. 

DNA methylation is a chemical mark added to cytosine 
residues by DNA methyltransferases: DNMT1, DNMT3A, and 
DNMT3B (Chen and Li, 2004). Mammalian genomes are glob- 
ally methylated, with the noticeable exception of short non- 
methylated regions called CpG islands. Current data indicate 
that promoter methylation leads to stable gene silencing, whereas 
intragenic methylation helps to weaken transcriptional noise 
(Suzuki and Bird, 2008). In addition, because several transcrip- 
tion factors and chromatin-binding proteins, such as CTCF and 
YY1, are methylation sensitive (Hark et al., 2000; Kim et al., 
2003), it is clear that DNA methylation can significantly affect 
the occupancy of a specific genomic region. 

It has been shown that although D4Z4 is highly methyl- 
ated in healthy subjects, FSHD patients have a specific hypo- 
methylation of the D4Z4 contracted allele (Fig. 4; van Overveld 
et al., 2003; de Greef et al., 2009). Recent findings indicate that 
D4Z4 contraction is always associated with hypomethylation, 
irrespective of the chromosome or the haplotype, as deletions 
on chromosome 10 and on chromosome 4 in asymptomatic car- 
riers are also associated with a reduction in DNA methylation of 
the contracted locus (de Greef et al., 2009).

Figure 4. Epigenetic features at 4q35 in healthy subjects and FSHD patients. In control individuals, the D4Z4 repeat array is characterized by markers of
chromatin repression, such as high levels of DNA methylation, SUV39H1-mediated H3K9me3, and EZH2-mediated H3K27me3. In contrast, FSHD patients
display hypomethylation, loss of H3K9me3, and corresponding loss of HP1γ and cohesin binding. In healthy subjects, D4Z4 is specifically bound by EZH2
and a repressor complex composed of YY1, HMGB2, and nucleolin. It is still to be determined whether binding of these factors is altered in FSHD patients.
The human 4q is perinuclear in both control and FSHD individuals. This localization depends on a region that is proximal to D4Z4 in healthy subjects but
is D4Z4-specific and CTCF-mediated in FSHD patients. A MAR located upstream of the repeat array was found to be weakened in FSHD, altering the 3D
chromosomal architecture of the region.

Moreover, D4Z4 is hypomethylated in patients affected by immunodeficiency,
centromeric instability, and facial anomalies syndrome (Kondo
et al., 2000). Interestingly, there are no common traits among 
these diseases. These findings suggest that D4Z4 hypomethyl- 
ation is not responsible for FSHD by itself. Nevertheless, it 
is interesting to note that non-4q-associated FSHD patients 
(FSHD2), which lack D4Z4 contractions, display general D4Z4 
hypomethylation on both chromosome 4 alleles and on the two 
chromosome 10 alleles (de Greef et al., 2009), pointing to a 
general defect in methylation of D4Z4 repeats. Collectively, 
these results suggest that D4Z4 hypomethylation might repre- 
sent a permissive condition required for FSHD onset or that it 
might be a consequence of the primary cause of FSHD. 

Histone modifications in FSHD. Most cellular DNA 
is compacted into nucleosomes, in which 146 bp of DNA are 
wrapped around a protein octamer composed of two copies of 
each core histone H2A, H2B, H3, and H4 (Campos and Reinberg, 
2009). Nucleosomes are linked by a variable length of DNA 
associated with linker histone H1 (Campos and Reinberg, 2009).
Histone proteins are subjected to different posttranslational 
covalent modifications, including acetylation, methylation,
ubiquitination, and SUMOylation of lysine (K) residues,
phosphorylation of serine (S) and threonine (T) residues, methyl- 
ation of arginines (R), and ADP-ribosylation of glutamic acid 
(Bernstein et al., 2007). Combinations of posttranslational 
modifications of single histones, single nucleosomes, and 
nucleosomal domains establish local and global patterns of 
chromatin modifications and recruit nuclear factors that medi- 
ate downstream functions (Ruthenburg et al., 2007). These pat- 
terns can be altered by multiple extracellular and intracellular 
stimuli, and chromatin itself functions as a genomic integrator 
of various signaling pathways, ultimately affecting cellular 
processes such as replication and transcription (Cheung et al., 
2000; Nightingale et al., 2006).

The D4Z4 repeat array appears to be organized in distinct 
domains, some characterized by transcriptionally repressive 
heterochromatin and others by transcriptionally permissive 
euchromatin (Zeng et al., 2009). In particular, on both chromo- 
some 4 and 10, the repressive marks of histone H3 lysine 9 
tri-methylation (H3K9me3) and histone H3 lysine 27 tri- 
methylation (H3K27me3) are both present on some D4Z4 units, 
but the permissive mark histone H3 lysine 4 di-methylation is
present on different units (Zeng et al., 2009). Consistent with
previous studies, the authors reported euchromatin histone marks 
in the first proximal D4Z4 unit of the array (Jiang et al., 2003; 
Zeng et al., 2009). 

The H3K9me3 modification on D4Z4 is mediated by
the histone methyltransferase SUV39H1 (Zeng et al., 2009).
Interestingly, H3K9me3 is lost in FSHD patients, preventing
the binding of the heterochromatin-binding protein
HP1γ and the sister chromatid cohesion complex, cohesin, to D4Z4 (Fig. 4).
This loss could lead to the de-repression of 4q35 genes and 
muscular dystrophy (Zeng et al., 2009). 

In a study aimed at characterizing the chromatin status of 
the FSHD region, both D4Z4 and the promoter of the 4q35 gene 
FRG1 (see below) were reported to be bound by the transcrip- 
tion factor YY1 and the Polycomb Group protein EZH2 (Bodega 
et al., 2009). Polycomb Group proteins are chromatin modifiers 
that implement transcriptional silencing in higher eukaryotes 
(Simon and Kingston, 2009). In particular, YY1 and EZH2 
binding are reduced both at D4Z4 and FRG1 promoter in myo- 
tubes compared with myoblasts (Bodega et al., 2009). As a 
consequence, the EZH2-mediated histone repressive mark 
H3K27me3 is also reduced in myotubes compared with myo- 
blasts. Accordingly, FRG1 expression is increased in myotubes 
compared with myoblasts. Notably, the H3K27me3 modifica- 
tion at D4Z4 was found by 3D FISH to be less abundant in 
FSHD cells compared with controls (Bodega et al., 2009). 

In summary, a number of repressive histone marks are 
present on D4Z4 in healthy subjects and their loss in FSHD 
might lead to de-repression of 4q35 genes. 

A repressor complex binds to D4Z4. A few years 
ago, a 27-bp sequence located inside each D4Z4 unit, termed the 
D4Z4 binding element, was identified (Gabellini et al., 2002). 
This element is specifically bound by a D4Z4 repressor com- 
plex (DRC) composed of YY1, HMGB2, and nucleolin (Gabellini 
et al., 2002). These factors interact with proteins that mediate 
gene silencing and heterochromatin formation, such as DNA 
methyltransferases, histone deacetylases, and HP1 (Ko et al., 
2008; Wu et al., 2009). DRC binds to the 4q35 located D4Z4 
in vivo and mediates the transcriptional repression of 4q35 genes 
(Gabellini et al., 2002). Thus, the loss of D4Z4 repeats in FSHD 
may result in reduced DRC binding to the region and, conse- 
quently, reduced silencing of the 4q35 genes (Fig. 4; Gabellini 
et al., 2004). Importantly, of the factors that have been identified 
to bind D4Z4, YY1 is the only one with sequence specificity. 
Thus, it would be interesting to investigate whether the recruitment
of factors like SUV39H1, EZH2, HP1γ, and cohesin to
D4Z4 is mediated by YY1 (Fig. 4). 

Subnuclear localization of 4q35. Most nuclear 
events do not occur randomly throughout the nucleoplasm; 
rather, they are usually limited to specific and spatially defined 
sites (Ferrai et al., 2010). Accordingly, the particular intra- 
nuclear positioning of a given chromosomal region plays an 
important role in several cellular processes, such as transcrip- 
tion and replication (Spector, 2001). 

Although mammalian telomeres in somatic cells are evenly
dispersed in the inner part of the nucleus (Luderus et al., 1996;
Nagele et al., 2001; Amrichova et al., 2003; Weierich et al., 2003),
the 4q telomere is located near the nuclear periphery (Masny
et al., 2004; Tam et al., 2004). Interestingly, a sequence 215 kb
proximal to the repeat array shows a stronger localization to the 
nuclear rim than D4Z4 in healthy subjects, suggesting that a re- 
gion proximal to D4Z4, and not the repeat array itself, directs 
the 4q telomere to the periphery (Fig. 4; Masny et al., 2004). 
Recently, Ottaviani et al. (2009b) identified an 80-bp sequence 
inside the D4Z4 unit that can trigger perinuclear positioning of 
artificial telomeres in a CTCF- and lamin A-dependent manner 
(see below). This property is lost upon D4Z4 multimerization. 
Thus, it appears that in healthy subjects, multiple copies of 
D4Z4 are located near the nuclear periphery due to a 4q-specific 
signal proximal to D4Z4, whereas in FSHD patients the peri- 
nuclear location is mediated by D4Z4 (Fig. 4; Ottaviani et al., 
2009b). Although FISH analyses indicate that the peripheral 
localization of 4q is maintained in different cell types and is 
apparently unaltered in FSHD patients compared with controls, 
the peripheral environment of the FSHD 4q35 allele may be 
altered, and thereby contribute to the aberrant 4q35 gene expression
reported in FSHD (Masny et al., 2004; Tam et al., 2004;
Ottaviani et al., 2009b). 

In metazoans, the nuclear lamina coats the inner surface of 
the nuclear envelope (Hetzer, 2010). Using the DNA adenine 
methyltransferase identification approach, lamin-associated do- 
mains that correlate with silenced regions have been identified 
(Guelen et al., 2008). Intriguingly, a lamin-associated domain has 
been mapped to a locus that is 50 kb proximal to D4Z4, which is 
consistent with the previous finding that this region has a role in 
maintaining gene repression (Guelen et al., 2008). The peripheral 
location of 4q seems to be strictly dependent on lamin A, given 
that chromosome 4 telomeres are dispersed in cells lacking the 
lamin A gene (Masny et al., 2004). Furthermore, chromatin 
immunoprecipitation assays revealed that lamin A is associated 
with D4Z4 in vivo (Ottaviani et al., 2009a). 

Altered chromatin organization at 4q35 in 
FSHD. As mentioned above, there is no macroscopic relocal- 
ization of 4q due to D4Z4 deletion in FSHD. Nonetheless, 
more subtle alterations may occur. For example, the 4q35 locus 
could be repositioned to a different peripheral subdomain, lead- 
ing to inappropriate 4q35 gene regulation (Masny et al., 2004; 
Ottaviani et al., 2009b). Alternatively, the higher order chroma- 
tin structure of the 4q35 locus might be affected. There is grow- 
ing evidence to indicate that the three-dimensional organization 
of the FSHD region significantly contributes to the regulation of 
gene expression at 4q35 (Petrov et al., 2006; Pirozhkova et al., 
2008; Bodega et al., 2009). 

Recently, it has been proposed that the area immediately 
proximal to D4Z4 could play a role in FSHD (Lemmers et al., 
2007; Tsumagari et al., 2008). Interestingly, this area has been 
suggested to function as a nuclear matrix attachment region 
(MAR; Petrov et al., 2006). Matrix attachment was shown to be 
weakened on the contracted chromosome 4 in FSHD-derived 
myoblasts compared with controls, leading to a drastic alter- 
ation in chromatin loop domain organization (Fig. 4; Petrov 
et al., 2006). In particular, whereas a high number of D4Z4 re- 
peats maintain the organization of the repeat array and 4q35 
genes in two distinct chromatin loops, loosening of the MAR in
FSHD patients would bring the contracted repeats and 4q35
genes into the same chromatin loop (Petrov et al., 2006). Ultimately,
the presence of an enhancer at the 5′ end of the D4Z4
unit could cause inappropriate 4q35 gene de-repression in FSHD
(Petrov et al., 2008).

Chromosome conformation capture (3C) is a technique 
that identifies long distance intra- and inter-chromosomal inter- 
actions (Dekker et al., 2002). Using 3C, two groups have inde- 
pendently investigated the higher order chromatin organization 
of the 4q35 locus (Pirozhkova et al., 2008; Bodega et al., 2009). 
Pirozhkova et al. (2008) showed that the telomeric 4qA allele 
is in close proximity to the promoters of the 4q35 genes FRG1 
and ANT1 in FSHD myoblasts and not in control myoblasts. 
The 4qA allele is immediately distal to the D4Z4 repeat array 
(Lemmers et al., 2002). Interestingly, an enhancer element was 
detected in the 4qA allele that could be involved in the reported 
up-regulation of FRG1 and ANT1 in FSHD (Gabellini et al., 
2002; Laoudj-Chenivesse et al., 2005; Pirozhkova et al., 2008). 
In the other 3C study, an interaction between D4Z4 and the 
FRG1 promoter was identified in human primary myoblasts that 
appears to be highly reduced upon myogenic differentiation 
(Bodega et al., 2009). Consistent with the observed mis-regulation 
of FRG1, a small but statistically significant reduction in the 
D4Z4-FRG1 promoter interaction was observed in FSHD myo- 
blasts compared with controls (Bodega et al., 2009). Alto- 
gether, it appears that in healthy subjects, the FRG1 promoter 
is in close proximity with the D4Z4 repeat and the gene is 
repressed, whereas in FSHD patients the promoter of FRG1 
is in close proximity with the 4qA marker and the gene is up- 
regulated (Gabellini et al., 2002; Pirozhkova et al., 2008; Bodega 
et al., 2009).

CTCF is a multifunctional DNA-binding protein that is 
important for transcriptional regulation, chromatin insula- 
tion, and chromatin organization (Filippova, 2008). The same 
80-bp D4Z4 element mediating the perinuclear positioning 
of the 4q telomere is also responsible for the CTCF and A-type 
lamin-dependent transcriptional insulator function of the re- 
peat (Ottaviani et al., 2009a). The CTCF binding and insulation 
activity are lost upon multimerization of the repeats (Ottaviani 
et al., 2009a). As such, it has been proposed that FSHD patients 
have a CTCF gain-of-function phenotype that "protects" certain 
genes from the influence of nearby repressive chromatin, ul- 
timately generating a 4q35 de-repressed state (Fig. 4; Ottaviani 
et al., 2009a). 

Clearly, the FSHD locus is organized into a higher order 
chromatin structure that undergoes dynamic remodeling. The 
3D architecture of the region appears to play a fundamental role 
in regulating 4q35 chromatin status and gene expression, sug- 
gesting that defects in the organization of the epigenome of the 
FSHD region could underlie this disease. 

FSHD candidate genes 

The studies aimed at understanding the molecular basis of FSHD 
indicate that an in cis alteration likely leads to the de-repression 
of target gene(s). This model provides a valid explanation for 
the 4q specificity and the autosomal-dominant transmission of 
the disease (Tupler and Gabellini, 2004). 



The 4q35 locus is a relatively gene-poor region (van Geel 
et al., 1999; Blair et al., 2002). Of the genes that have been 
identified at 4q35, those of particular interest are ANT1, FRG1, 
DUX4c, FRG2, and DUX4 (Li et al., 1989; van Deutekom et al., 
1996b; Gabriels et al., 1999; Rijkers et al., 2004; Bosnakovski
et al., 2008b). We will focus here on the two main candidate 
genes, DUX4 and FRG1. We will not discuss the other genes in 
detail because they are less attractive candidates. For example, 
DUX4c and FRG2 were found to be deleted in some FSHD 
families (Fig. 2; Lemmers et al., 2003), suggesting that they are 
not necessary for disease onset. Nevertheless, these genes are 
present in most of the affected families and it is possible that 
they could contribute to the penetrance and severity of the disease. 
There is some experimental support for this idea for DUX4c 
(Bosnakovski et al., 2008b; Ansseau et al., 2009). 

DUX4. Although the DUX4 gene contains an ORF en- 
coding a putative double homeobox protein named DUX4, the 
D4Z4 repeat was initially considered to be a non-protein-coding 
sequence due to lack of evidence of transcription and protein 
synthesis. Nonetheless, both DUX4 mRNA and protein were 
recently detected in FSHD-derived primary myoblasts but not 
in controls, suggesting that D4Z4 may directly affect disease 
progression through the aberrant production of DUX4 (Dixit 
et al., 2007). Because the D4Z4 repeat does not contain a 
canonical polyadenylation signal, the mRNA is generated exclu- 
sively by transcription of the last, most distal, unit of the array 
that extends to a region named pLAM, which contains a poly- 
adenylation signal (Dixit et al., 2007). It was recently shown that 
the pLAM polyadenylation signal can stabilize DUX4 tran- 
scripts that are ectopically expressed in transfected C2C12 cells 
(Lemmers et al., 2010b). This pLAM sequence also contains a 
polymorphism that could affect the polyadenylation of the dis- 
tal DUX4 transcript (Lemmers et al., 2010b). Because a group 
of analyzed FSHD1 patients were all found to carry the same 
SNP in this region, it was suggested that this polymorphism 
contributes to the selective stabilization of DUX4 transcripts in 
FSHD (Lemmers et al., 2010b). It should be noted, however, 
that in this study the permissive 4qA161 allele was compared only 
to nonpermissive 4qB alleles or 10qA chromosomes (Lemmers 
et al., 2010b). Non-permissive 4qA variants, such as the previ- 
ously described 4qA166 (Lemmers et al., 2007; de Greef et al., 
2009), were not analyzed. Hence, the available data do not allow 
us to exclude the possibility that the described variations reflect 
4qA/B or 4q/10q differences. 

The DUX4 pre-mRNA can be alternatively spliced (Snider 
et al., 2009). Interestingly, it was recently reported that muscles 
of healthy subjects express low levels of a DUX4 splicing isoform 
encoding a truncated protein, whereas muscles of FSHD pa- 
tients express a splicing isoform encoding full-length DUX4 
(Snider et al., 2010). It has also been reported that the repeat array 
displays a complex transcriptional profile that includes sense and 
antisense transcripts and RNA processing (Snider et al., 2009). 
Thus, there may be multiple D4Z4-derived RNA players in FSHD 
and future work will be required to determine their functions. 

Recently, an isogenetic screen was used to assess the effect of 
overexpressing the FSHD candidate genes ANT1, FRG1, FRG2, DUX4c, 
and DUX4 on cell viability (Bosnakovski et al., 2008a). 
Previous work demonstrated a pro-apoptotic function for DUX4 
(Kowaljow et al., 2007), and DUX4 overexpression was found to 
have a dramatically toxic effect. As a consequence of increased 
DUX4 levels, the expression of oxidative stress response genes 
and of myogenic factors (such as MyoD and Myf5) was altered. 
Interestingly, a myogenic defect has been reported in FSHD 
patients (Winokur et al., 2003a; Celegato et al., 2006). 

DUX4 homeodomains are similar to those of Pax3 and 
Pax7, which are transcription factors that are pivotal to muscle 
function (Buckingham, 2007). Additionally, the phenotype of 
DUX4 overexpression can be rescued by overexpression of Pax3 
or Pax7 (Bosnakovski et al., 2008a). 

Recently, DUX4 overexpression (even at extremely low 
levels) was reported to cause massive apoptosis and severely 
abnormal development in Xenopus laevis, a model for verte- 
brate development (Wuebbles et al., 2010). Note that an increase 
in apoptosis is not generally considered to be a phenotype of 
FSHD (Winokur et al., 2003b). 

In theory, DUX4 represents an attractive FSHD candidate 
gene, as it would easily explain the requirement for at least one 
D4Z4 unit for the development of the disease. Due to its extreme 
toxicity, however, DUX4 could only function in FSHD if it is ab- 
sent under normal conditions and overexpressed exclusively in the 
muscle cell precursors of FSHD patients (Wuebbles et al., 2010). 

FRG1. FSHD region gene 1 (FRG1) is highly conserved in 
both vertebrates and invertebrates (Grewal et al., 1998). FRG1 
is considered a likely candidate gene for mediating FSHD due 
to its selective chromosomal location at 4q35, its overexpres- 
sion in FSHD samples (Gabellini et al., 2002; Bodega et al., 
2009), and the development of FSHD-like phenotypes follow- 
ing its overexpression in mice, Xenopus laevis, and Caenorhab- 
ditis elegans (Gabellini et al., 2006; Hanel et al., 2009; Wuebbles 
et al., 2009; Liu et al., 2010). Transgenic mice overexpressing 
FRG1 selectively in the skeletal muscle develop pathologies 
with physiological, histological, ultrastructural, and molecular 
features that resemble those of FSHD patients (Gabellini et al., 
2006). Nevertheless, FRG1 overexpression in FSHD patients 
is currently controversial, and studies have reported incon- 
sistent results (Gabellini et al., 2002, 2006; Dixit et al., 2007; 
Osborne et al., 2007; Bodega et al., 2009; Klooster et al., 2009). 
The idea that FRG1 plays an important role in FSHD was very 
recently challenged by the identification of an unusual FSHD 
patient with deletion of D4Z4 only on chromosome 10 (Lemmers 
et al., 2010b). However, as stated above, this patient requires 
further characterization before a causative role for FRG1 in FSHD 
can be discarded. As discussed below, FRG1 is crucial for proper 
muscle function and vascular development (Gabellini et al., 
2006; Hanel et al., 2009; Wuebbles et al., 2009). Hence, FRG1 
could at minimum affect the severity of the disease. 

To better understand the role of FRG1 in FSHD, efforts 
have been made to characterize its biological function. The 
human endogenous FRG1 protein copurifies with the spliceo- 
some, the protein-RNA macromolecular complex responsible 
for pre-mRNA splicing (Kim et al., 2001; Rappsilber et al., 
2002; Bessonov et al., 2008). When ectopically overexpressed, 
FRG1 localizes to nucleoli, Cajal bodies, and speckles (van 
Koningsbruggen et al., 2004), and colocalizes and/or interacts with 
proteins involved in RNA biogenesis, such as SMN, PABPN1, and 
FAM71B (van Koningsbruggen et al., 2007). Interestingly, mu- 
tations in SMN and PABPN1 cause myopathies (Calado et al., 
2000; Briese et al., 2005). Together, these results suggest that 
FRG1 functions in RNA processing. Indeed, in FSHD and in 
transgenic mice or cells overexpressing FRG1, altered splicing 
has been observed for a number of genes (Gabellini et al., 2006; 
van Koningsbruggen et al., 2007; Davidovic et al., 2008). 

Specific muscle-related functions for FRG1 have been 
identified in different animal models (Hanel et al., 2009; Liu 
et al., 2010). Interestingly, overexpression of Xenopus frg1 or 
C. elegans FRG-1 causes a muscle defect (Hanel et al., 2009; 
Liu et al., 2010). 

FRG1 contains a single fascin-like domain, a motif that is 
associated with actin-bundling properties (Edwards and Bryan, 
1995), and it was recently shown that FRG1 can bind F-actin 
and promote its bundling (Liu et al., 2010). In agreement with 
this finding, in different organisms the endogenous FRG1 in 
muscle has not only a nuclear distribution but is also a sarco- 
meric protein, suggesting that FRG1 might perform a muscle- 
specific function (Hanel et al., 2010; Liu et al., 2010). 

FSHD pathology is most prominent in the musculature; 
however, up to 75% of FSHD patients display retinal vasculopathy 
(Fitzsimons et al., 1987; Padberg et al., 1995). Intriguingly, it 
was recently reported that in human tissues the endogenous 
FRG1 is strongly expressed in arteries, veins, and capillaries 
(Hanel et al., 2010). Moreover, in Xenopus FRG1 levels are 
crucial for proper vascular development, and up-regulation 
of FRG1 leads to a disrupted vascular phenotype (Wuebbles 
et al., 2009). 

Collectively, these studies indicate that FRG1 overexpres- 
sion in different animal models is associated with aberrant mus- 
cle structure and vasculature, the two most prominent features 
of FSHD pathology. 

Conclusions 

Recent studies suggest that copy number variations (CNVs) are 
important for human phenotypic diversity and disease suscepti- 
bility. DNA repeats account for 55% of the human genome and 
a significant fraction of CNVs. 

FSHD is an important pathology caused by CNVs of 
D4Z4 repeats. It is an extremely complicated and fascinating 
disease, and research into this topic is revealing much about the 
functional organization of our genome. 

An increasing amount of evidence suggests that the 4q35 
macrosatellite repeat D4Z4 plays a crucial role in the chromo- 
somal organization of the FSHD region. There is a general con- 
sensus that the D4Z4 deletion in FSHD leads to epigenetic 
alterations that affect the expression profiles of genes within the 
FSHD region. Unfortunately, despite considerable effort, almost 
20 years after the identification of the genetic defect underlying 
the disease, the causative FSHD gene(s) remains unknown, and 
no effective treatments for FSHD are currently available. 

The heterogeneity in disease manifestation probably re- 
flects heterogeneity in gene expression in FSHD. An interesting 
possibility, therefore, is that the complexity of FSHD could be 
explained by considering it to be a contiguous gene syndrome, 
where the epigenetic alterations of DUX4, FRG1, and other 
potential genes collaborate to determine the final phenotype. 
Finally, because DUX4 behaves as a transcriptional activator 
(Dixit et al., 2007), it could play a direct role in transcriptional 
overexpression of the other 4q35 genes, providing a unifying 
model for the molecular mechanism of the disease. 